mathematical proof
Mathematics is hard for mathematicians to understand too Science
At a recent conference on mathematics in the age of automated proofs, mathematician and Fields Medalist Akshay Venkatesh presented “How do we talk to our students about AI?” He quoted an email he'd received from a young student who asked, “Do you believe that mathematics is worth being studied in a world in which a machine can answer everything for you? What do you believe would be the ‘job’ of a mathematician in this world?” Venkatesh framed AI as an opportunity to correct what he called an “essential gap that has opened between the practice of mathematics and our values.” Mathematician William Thurston has explained these values by writing, “mathematics is not about numbers, equations, computations, or algorithms: it is about understanding.” But Venkatesh argued that the record on this is terrible, lamenting that “for a typical paper or talk, very few of us understand it.” He is not alone in thinking that something is wrong with the current state of mathematics research.
Generating Millions Of Lean Theorems With Proofs By Exploring State Transition Graphs
Large Language Models (LLMs) have demonstrated significant potential in generating mathematical proofs. However, a persistent challenge is that LLMs occasionally make mistakes, and even a minor mistake can invalidate an entire proof. Proof assistants like Lean offer a remedy. They are designed for verifying each step of a proof in a formal language, and in recent years researchers have created AI models to generate proofs in their languages. However, the scarcity of large-scale datasets of Lean proofs restricts the performance of such Automated Theorem Proving (ATP) models. We developed LeanNavigator, a novel method for generating a large-scale dataset of Lean theorems and proofs by finding new ways to prove existing Lean theorems. By leveraging an interactive Lean client and an efficient method for proof step generation, LeanNavigator efficiently produces new theorems with corresponding proofs. Applying this approach to Mathlib4, we generated 4.7 million theorems totaling 1 billion tokens, surpassing previous datasets by more than an order of magnitude. Using this extensive dataset, we trained an AI model that outperforms the state-of-the-art ReProver model in theorem-proving tasks. These results confirm our hypothesis and demonstrate the critical role of large datasets in improving the performance of automated theorem provers.
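The idea of exploring a state transition graph can be illustrated with a toy sketch. This is not LeanNavigator's implementation and does not talk to Lean: the integer "states", the two "tactics", and the function name `explore` are all hypothetical stand-ins. It only shows how enumerating different paths from a start state to a closed goal yields several distinct proofs of the same statement.

```python
from collections import deque

# Hypothetical toy setup (an assumption, not Lean): a proof state is a
# non-negative integer, and the goal counts as closed when it reaches 0.
# Each "tactic" maps a state to a successor, or None if it does not apply.
TACTICS = {
    "halve": lambda n: n // 2 if n > 0 and n % 2 == 0 else None,
    "decrement": lambda n: n - 1 if n > 0 else None,
}


def explore(start, goal, max_depth):
    """Breadth-first search over the state graph.

    Returns every tactic sequence of length <= max_depth that
    transforms `start` into `goal`, in order of discovery.
    """
    proofs = []
    queue = deque([(start, [])])
    while queue:
        state, path = queue.popleft()
        if state == goal:
            proofs.append(path)  # one complete "proof"
            continue
        if len(path) == max_depth:
            continue  # depth budget exhausted on this branch
        for name, tactic in TACTICS.items():
            successor = tactic(state)
            if successor is not None:
                queue.append((successor, path + [name]))
    return proofs


if __name__ == "__main__":
    for proof in explore(4, 0, max_depth=3):
        print(" ; ".join(proof))
```

In a real system each edge would be a tactic application checked by the proof assistant, so every path found is verified by construction; the toy version has no such checker.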
LemmaHead: RAG Assisted Proof Generation Using Large Language Models
Yang, Tianbo, Yan, Mingqi, Zhao, Hongyi, Yang, Tianshuo
Developing the logic necessary to solve mathematical problems or write mathematical proofs is one of the more difficult objectives for large language models (LLMs). Currently, the most popular methods in the literature consist of fine-tuning the model on written mathematical content such as academic publications and textbooks, so that the model can learn to emulate the style of mathematical writing. In this project, we explore the effectiveness of using retrieval augmented generation (RAG) to address gaps in the mathematical reasoning of LLMs. We develop LemmaHead, a RAG knowledge base that supplements queries to the model with relevant mathematical context, with particular focus on context from published textbooks. To measure our model's performance in mathematical reasoning, our testing paradigm focuses on the task of automated theorem proving via generating proofs to a given mathematical claim in the Lean formal language.
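The retrieval step of a RAG pipeline can be sketched in a few lines. This is a minimal illustration under stated assumptions, not LemmaHead's code: the token-overlap score stands in for whatever retriever the authors use, and the names `retrieve` and `build_prompt` as well as the prompt wording are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a crude stand-in for a learned embedding."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_score(query, passage):
    """Number of tokens shared by query and passage (multiset intersection)."""
    q, p = Counter(tokenize(query)), Counter(tokenize(passage))
    return sum((q & p).values())


def retrieve(query, knowledge_base, k=2):
    """Return the k passages with the highest overlap score."""
    ranked = sorted(
        knowledge_base,
        key=lambda passage: overlap_score(query, passage),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(claim, knowledge_base, k=2):
    """Prepend the retrieved passages to the claim (hypothetical prompt format)."""
    context = "\n".join(f"- {p}" for p in retrieve(claim, knowledge_base, k))
    return f"Context:\n{context}\n\nProve in Lean: {claim}"


if __name__ == "__main__":
    kb = [
        "A group is a set with an associative operation, an identity, and inverses.",
        "A prime number has exactly two positive divisors.",
        "The sum of two even integers is even.",
    ]
    print(build_prompt("The sum of two even integers is even", kb, k=1))
```

The prompt string would then be sent to the language model; that call is omitted here because the abstract does not specify which model or API is used.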
Autograding Mathematical Induction Proofs with Natural Language Processing
Zhao, Chenyan, Silva, Mariana, Poulsen, Seth
Writing mathematical proofs has been identified as an important [1-3] and yet challenging topic [4] in computing education and mathematics education. A large body of research has shown that timely feedback is crucial to student learning [5, 6]. However, students are largely unable to receive timely feedback on written proofs due to the need to have proofs collected and hand-graded by instructors or teaching assistants. The ability to grade student proofs fully automatically with natural language processing (NLP) alleviates this need by allowing us to give students instant feedback on their proofs to let students iteratively enhance the quality of their proofs. In this paper, we propose a novel set of training methods and models capable of autograding freeform mathematical proofs, a problem at the intersection of mathematical proof education and Automatic Short Answer Grading (ASAG), by using existing NLP models and other machine learning techniques. Our proof autograder enables the development of grading systems that provide instant feedback to students without needing attention from instructors. It can also be deployed in large-scale educational platforms, allowing for more access for students. The main contributions of this paper are:
- Introducing the first pipeline of machine learning models capable of autograding mathematical proofs with similar accuracy to human graders
- Quantifying the amount of training data needed to achieve a satisfactory performance from the grading models
- Publishing an anonymized and labeled mathematical proof dataset that can be used in future model developments [7]
- Creating a set of autograded problems using the grading pipeline, and performing a user study that answers the following research questions:
  - Are students able to write better proofs by interacting with the autograder and the feedback it generates?
Knowledge vs. intelligence amid the hype and hysteria over AI
Kara Frederick, tech director at the Heritage Foundation, discusses the need for regulations on artificial intelligence as lawmakers and tech titans discuss the potential risks. The current infatuation with artificial intelligence is indicative of the level of competence of those who are in the headlights of a fast-moving, still unidentified, flying object. The headlines range from "Mitigating the risk of extinction from AI" (through the European Commission) to promising a world free of disease (cancer, in particular), and unlimited prosperity. No more need for lawyers (thank God!), no more need for doctors, not to say truck drivers, and Hollywood screenwriters. AI is all over, most of the time in stealth mode – and pretty successful in every form of surveillance (there are so many).
Provably safe systems: the only path to controllable AGI
Tegmark, Max, Omohundro, Steve
"Once the machine thinking method had started, it would not take long to outstrip our feeble powers. At some stage therefore we should have to expect the machines to take control" (Alan Turing, 1951 [35])
AGI [91] safety is of the utmost urgency, since corporations and research labs are racing to build AGI despite prominent AI researchers and business leaders warning that it may lead to human extinction [11]. While governments are drafting AI regulations, there's little indication that they will be sufficient to resist competitive pressures and prevent the creation of AGI. Median estimates on the forecasting platform Metaculus of the date of AGI's creation have plummeted over the past few years from many decades away to 2027 [25] or 2032 [24] depending on definitions, with superintelligence expected to follow a few years later [23]. Is Alan Turing correct that we now "have to expect the machines to take control"?
Should AI be allowed to create Art?
The questions around AI and art are not new ones. For decades now artists have been using AI in various forms to create art, but it's only in the last few decades that we've begun to see what many of us consider to be 'AI art' -- namely art made almost entirely by AI, with as little human interaction as possible. But should AI even be making art, and how will the art world react to what may very soon be the first AI maestro? What does it mean to be creative? To ask this question is really to ask what it means to be human in the first place. For millennia people have tried to answer this question and I make no claims to have done so here, but I do think it's important to distinguish between 'creativity' and 'apparent creativity'.
Automation: An Essential Component Of Ethical AI?
Nallur, Vivek, Lloyd, Martin, Pearson, Siani
Ethics is sometimes considered to be too abstract to be meaningfully implemented in artificial intelligence (AI). In this paper, we reflect on other aspects of computing that were previously considered to be very abstract. Yet, these are now accepted as being done very well by computers. These tasks have ranged from multiple aspects of software engineering to mathematics to conversation in natural language with humans. This was done by automating the simplest possible step and then building on it to perform more complex tasks. We wonder if ethical AI might be similarly achieved and advocate the process of automation as a key step in making AI take ethical decisions. The key contribution of this paper is to reflect on how automation was introduced into domains previously considered too abstract for computers.
Interesting Problem: Self-correcting Random Walks
Section 3 was added on October 11. Section 4 was added on October 19. A $2,000 award is offered for solving any of the open questions. This is another off-the-beaten-path problem, one that you won't find in textbooks. You can solve it using data science methods (my approach), but a mathematician with some spare time could find an elegant solution.